DETAILED ACTION

Response to Arguments 

1. Applicant's arguments, see Pre-Appeal Request for Review, filed March 9, 2006, with
respect to the rejection(s) of claim(s) 1-8, 10-16, 32-33, and 35-36 under 35 U.S.C. 103 have
been fully considered and are persuasive. Therefore, the rejection has been withdrawn. 
However, upon further consideration, a new ground(s) of rejection is made in view of Roth (US 
Patent Application Publication No. 2005/0038657). 

2. Applicant's arguments with respect to claims 9, 17-31, 34 and 37 have been fully 
considered but they are not persuasive. 

In response to applicant's argument that the examiner's conclusion of obviousness is 
based upon improper hindsight reasoning, it must be recognized that any judgment on 
obviousness is in a sense necessarily a reconstruction based upon hindsight reasoning. But so 
long as it takes into account only knowledge which was within the level of ordinary skill at the 
time the claimed invention was made, and does not include knowledge gleaned only from the 
applicant's disclosure, such a reconstruction is proper. See In re McLaughlin, 443 F.2d 1392, 
170 USPQ 209 (CCPA 1971). 

In response to applicant's argument that there is no suggestion to combine the references, 
the examiner recognizes that obviousness can only be established by combining or modifying the 
teachings of the prior art to produce the claimed invention where there is some teaching, 
suggestion, or motivation to do so found either in the references themselves or in the knowledge 
generally available to one of ordinary skill in the art. See In re Fine, 837 F.2d 1071, 5
USPQ2d 1596 (Fed. Cir. 1988) and In re Jones, 958 F.2d 347, 21 USPQ2d 1941 (Fed. Cir.
1992). In this case, Dionne specifically teaches that the system is useful in assisting individuals
with learning disabilities or severe visual impairments, and one of ordinary skill would clearly
recognize the desirability of providing such individuals with text editing assistance so as to have
a reading machine spell a word and provide a text-to-speech output of the word. Further, the
Examiner contends, the fact that applicant has recognized another advantage which would flow 
naturally from following the suggestion of the prior art cannot be the basis for patentability when 
the differences would otherwise be obvious. See Ex parte Obiaya, 227 USPQ 58, 60 (Bd. Pat. 
App. & Inter. 1985). 

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all 
obviousness rejections set forth in this Office action: 

(a) A patent may not be obtained though the invention is not identically disclosed or described as set forth in 
section 102 of this title, if the differences between the subject matter sought to be patented and the prior art are 
such that the subject matter as a whole would have been obvious at the time the invention was made to a person 
having ordinary skill in the art to which said subject matter pertains. Patentability shall not be negatived by the 
manner in which the invention was made. 

3. Claims 1-8, 10-16, 32-33, and 35-36 are rejected under 35 U.S.C. 103(a) as being 
unpatentable over Mitchell (US Patent No. 5,799,273) in view of Roth et al (US Patent 
Application Publication No. 2005/0038657). 

4. Regarding claims 1 and 35, Mitchell discloses a system for relating words in an audio file 
to words in a text file, comprising: retrieving a text file comprising a plurality of textual words 
(col. 6, lines 20-29); generating an audio file comprising a plurality of audible words based on 
the text file (col. 6, lines 9-19); storing information relating each audible word to a 



Application/Control Number: 10/020,102 Page 4

Art Unit: 2626 

corresponding textual word (col. 6, lines 48-65); and an electronic marker within the audio file 
that indicates the position of the audible word within the text file (abstract; col. 9, lines 13-25 in 
which Mitchell discloses the user can delete and/or insert text and the recognition interface 
updates the links between the recognized word and the associated audio components such that 
link data is amended to indicate the correct character position of the word in the text). 

Mitchell does not teach the audio file is transmitted or available to a user of a 
telecommunications device. Roth discloses a combined speech recognition and text-to-speech 
system for use in a cellular telephone, in which text-to-speech (TTS) generation is used in 
conjunction with large vocabulary speech recognition to say words selected by the speech 
recognizer. TTS or recorded audio can be used to say both recognized text and the names of 
recognized commands after their recognition. The TTS can repeat text recognized by the 
speech recognition after each of a succession of end of utterance detections. A user can move a 
cursor back or forward in recognized text, and the TTS can speak one or more words at the 
cursor location after each such move. The speech recognition can be used to produce a choice
list of possible recognition candidates and the TTS can be used to provide spoken output of one
or more of the candidates on the choice list. Roth suggests that such a system provides for an
effective large-vocabulary speech recognition that is used on portable computers that is capable
of providing a user interface that makes it easier and faster to create, edit, and use speech 
recognition on such devices. 

It would have been obvious to one of ordinary skill at the time of the invention to modify
the system of Mitchell to provide the linked audio and text data to users of a plurality of
computing devices, as suggested by Roth, for the purpose of providing an effective
large-vocabulary speech recognition that is used on portable computers that is capable of providing a
user interface that makes it easier and faster to create, edit, and use speech recognition on such
devices, as also suggested by Roth.

Regarding claim 2, Mitchell discloses the textual words comprise ASCII text (col. 5, lines
59-67).

Regarding claim 3, Mitchell discloses the audio file is stored in the form of a WAV file 
(col. 6, lines 9-29; col. 13, lines 26-30). 

Regarding claim 4, Mitchell discloses the information comprises voice tags embedded in 
the audio file (col. 7, lines 1-30). 

Regarding claim 5, Mitchell discloses the information comprises a file map relating a 
location of each textual word within the text file to a location of the corresponding audible word 
in the audio file (col. 6, line 48 to col. 8, line 3). 
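
For illustration only, the file map recited in claim 5 can be pictured as a table that pairs the location of each textual word in the text file with the location of the corresponding audible word in the audio file. The sketch below is hypothetical: the names `FileMapEntry` and `build_file_map`, the whitespace tokenization, and the fixed audio length per word are assumptions made here and are not drawn from Mitchell or from the application.

```python
from dataclasses import dataclass


@dataclass
class FileMapEntry:
    word: str
    text_offset: int   # character position of the word in the text file
    audio_offset: int  # byte position of the audible word in the audio file


def build_file_map(text, bytes_per_word=1000):
    """Pair each whitespace-delimited word with a text and an audio location.

    Assumption: every audible word occupies bytes_per_word bytes. A real
    synthesizer would report the actual length of each word it generates.
    """
    entries = []
    search_from = 0
    for index, word in enumerate(text.split()):
        # Search from the end of the previous word so repeated words
        # resolve to their own positions.
        text_offset = text.index(word, search_from)
        entries.append(FileMapEntry(word, text_offset, index * bytes_per_word))
        search_from = text_offset + len(word)
    return entries


file_map = build_file_map("read this text")
```

Each entry can then be looked up in either direction, from a text position to an audio position or the reverse.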

Regarding claims 6 and 36, Mitchell discloses the method steps are performed by logic
embodied in a computer readable medium (col. 4, line 66 to col. 5, line 36). 

Regarding claims 7, 15, and 32, Mitchell discloses a system for relating words in an
audio file to words in a text file, comprising: retrieving a text file comprising a textual word (col. 6,
lines 20-29); generating an audible word corresponding to the textual word (col. 6, lines 9-19);
storing the audible word in an audio file (col. 6, lines 9-29; col. 13, lines 26-30); storing a file map,
the file map comprising: a first location locating the audible word within the audio file (Figures 3-4;
col. 6, line 48 to col. 7, line 30); and a second location locating the textual word within the text
file (Figures 3-4; col. 6, line 48 to col. 7, line 30).

Mitchell does not teach the audio file is transmitted or available to a user of a
telecommunications device. Roth discloses a combined speech recognition and text-to-speech 
system for use in a cellular telephone, in which text-to-speech (TTS) generation is used in 
conjunction with large vocabulary speech recognition to say words selected by the speech 
recognizer. TTS or recorded audio can be used to say both recognized text and the names of 
recognized commands after their recognition. The TTS can repeat text recognized by the 
speech recognition after each of a succession of end of utterance detections. A user can move a 
cursor back or forward in recognized text, and the TTS can speak one or more words at the 
cursor location after each such move. The speech recognition can be used to produce a choice
list of possible recognition candidates and the TTS can be used to provide spoken output of one 
or more of the candidates on the choice list. Roth suggests that such a system provides for an 
effective large-vocabulary speech recognition that is used on portable computers that is capable 
of providing a user interface that makes it easier and faster to create, edit, and use speech 
recognition on such devices. 

It would have been obvious to one of ordinary skill at the time of the invention to modify 
the system of Mitchell to provide the linked audio and text data to users of a plurality of computing
devices, as suggested by Roth for the purpose of providing an effective large-vocabulary speech 
recognition that is used on portable computers that is capable of providing a user interface that 
makes it easier and faster to create, edit, and use speech recognition on such devices, as also 
suggested by Roth. 

Regarding claims 8, 16, and 33, Mitchell discloses repeating the steps of the method for a
plurality of textual words in the text file (col. 5, line 59 to col. 8, line 3; Figures 3-4).

Regarding claim 10, Mitchell discloses a system for relating words in an audio file to
words in a text file, comprising: retrieving a text file comprising a plurality of textual words (col.
6, lines 20-29); generating an audible word corresponding to each textual word, each audible
word comprising media stream packets (col. 6, lines 9-29); and playing the audible words to a
user in real time as the audible words are generated (col. 8, line 52 to col. 10, line 2); and during the
playing of the audible words, determining a current textual word corresponding to the audible word
currently being played (col. 8, line 52 to col. 10, line 2).

Mitchell does not teach the audio file is transmitted or available to a user of a 
telecommunications device. Roth discloses a combined speech recognition and text-to-speech 
system for use in a cellular telephone, in which text-to-speech (TTS) generation is used in 
conjunction with large vocabulary speech recognition to say words selected by the speech 
recognizer. TTS or recorded audio can be used to say both recognized text and the names of 
recognized commands after their recognition. The TTS can repeat text recognized by the 
speech recognition after each of a succession of end of utterance detections. A user can move a 
cursor back or forward in recognized text, and the TTS can speak one or more words at the 
cursor location after each such move. The speech recognition can be used to produce a choice
list of possible recognition candidates and the TTS can be used to provide spoken output of one
or more of the candidates on the choice list. Roth suggests that such a system provides for an
effective large-vocabulary speech recognition that is used on portable computers that is capable
of providing a user interface that makes it easier and faster to create, edit, and use speech
recognition on such devices.

It would have been obvious to one of ordinary skill at the time of the invention to modify
the system of Mitchell to provide the linked audio and text data to users of a plurality of computing
devices, as suggested by Roth for the purpose of providing an effective large-vocabulary speech 
recognition that is used on portable computers that is capable of providing a user interface that 
makes it easier and faster to create, edit, and use speech recognition on such devices, as also 
suggested by Roth. 

Regarding claim 11, Mitchell discloses the textual words comprise ASCII text (col. 5, 
lines 59-67). 

Regarding claim 12, Mitchell discloses initializing a counter identifying textual words 
within the text file (col. 6, line 48 to col. 7, line 30); and incrementing the counter after each 
audible word is played (col. 6, line 48 to col. 7, line 30); wherein the step of determining 
comprises identifying the current textual word using the counter (col. 6, line 48 to col. 7, line 
30). 
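
As an aid to reading the counter limitation of claim 12, the following minimal sketch shows one way a counter could identify the current textual word during playback. It is an assumption-laden illustration, not Mitchell's implementation: `play_with_counter` and the `play` callback are hypothetical names introduced here, and the callback merely stands in for audio output.

```python
def play_with_counter(words, play):
    """Play each audible word in order and track the current textual word.

    `play` is a hypothetical callback standing in for audio output.
    Returns the (counter value, word) pairs observed during playback.
    """
    history = []
    counter = 0  # initialize the counter identifying textual words
    for word in words:
        # The counter value identifies the textual word now being played.
        history.append((counter, word))
        play(word)
        counter += 1  # increment after each audible word is played
    return history


played = []
history = play_with_counter("read this text".split(), played.append)
```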

Regarding claim 13, Mitchell discloses storing information about the audible word, the 
information comprising: an identifier for the textual word corresponding to the audible word (col.
6, line 48 to col. 8, line 3); and a time at which the audible word was played (col. 6, line 48 to 
col. 8, line 3; Figures 3-4). 

Regarding claim 14, Mitchell discloses the method steps are performed by logic
embodied in a computer readable medium (col. 4, line 66 to col. 5, line 36).

5. Claims 9, 17-31, 34 and 37 are rejected under 35 U.S.C. 103(a) as being unpatentable
over Mitchell in view of Dionne (US Patent No. 6,068,487) and further in view of Frulla et al.
(US Patent No. 6,424,357).

6. Regarding claims 9, 17-31, 34 and 37, Mitchell discloses a system which provides a user 
interface for relating words in an audio file to words in a text file, comprising: retrieving a text file
comprising a textual word (col. 6, lines 20-29); generating an audible word corresponding to the
textual word (col. 6, lines 9-19); storing the audible word in an audio file (col. 6, lines 9-29; col. 13,
lines 26-30); storing a file map, the file map comprising: a first location locating the audible word
within the audio file (Figures 3-4; col. 6, line 48 to col. 7, line 30); and a second location locating
the textual word within the text file (Figures 3-4; col. 6, line 48 to col. 7, line 30). 

Mitchell does not teach that the system identifies an audible word to be spelled in 
response to the command to spell; identifies a textual word in a text file corresponding to the 
audible word to be spelled; and audibly spells the textual word. Dionne teaches a method for
having a reading machine spell a word, which includes retrieving a word to be spelled,
displaying letters of the word, spelling the word and providing a text-to-speech output of the word
(col. 3, lines 8-34). Dionne teaches that the system is useful in assisting individuals with 
learning disabilities or severe visual impairments. 
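
The spelling limitation at issue (identify the audible word to be spelled, identify the corresponding textual word, then spell it) can be sketched as below. This is a hypothetical illustration only: `spell_word`, the tuple layout of the map, and the use of an audio position to select the word are assumptions made here and are not taken from Mitchell, Dionne, or the application.

```python
def spell_word(file_map, audio_position):
    """Return the letters of the textual word whose audio spans audio_position.

    file_map is a list of (audio_start, audio_end, textual_word) tuples,
    a hypothetical layout chosen for this sketch.
    """
    for audio_start, audio_end, word in file_map:
        if audio_start <= audio_position < audio_end:
            # Each letter would be handed to a speech synthesizer in turn.
            return list(word.upper())
    return []  # no audible word at that position


file_map = [(0, 1000, "read"), (1000, 2000, "this"), (2000, 3000, "text")]
letters = spell_word(file_map, 1500)
```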

It would have been obvious to one of ordinary skill at the time of the invention to modify 
the system of Mitchell to provide the spelling of words in the text, to aid in the editing of 
recognized text and in the correcting of recognition errors, for the purpose of assisting 
individuals with visual impairments with editing of text.

Mitchell and Dionne do not teach the command input to the system is via a voice
command. However, implementation of voice commands to allow for system functionality and 
control similar to that of hand-controlled input devices was well known in the art. 

Frulla discloses a voice input system having a microphone coupled to a computing device,
with the computing device typically operating a computer software application. A user speaks 
voice commands into the microphone, with the computing device operating a voice command 
module that interprets the voice command and causes the graphical or non-graphical application 
to be commanded and controlled consistent with the use of a physical mouse (Figures 2-3; col. 
4, lines 57-64), and specifically teaches the system is advantageous in environments in which it 
is inconvenient or impractical to use a mouse, thereby making the user interface more
convenient and efficient for a user to input information and commands. 

It would have been obvious to one of ordinary skill at the time of the invention to modify 
the system of Mitchell to provide the spelling of words in the text, to aid in the editing of 
recognized text and in the correcting of recognition errors and to further provide voice command 
control, as suggested by Frulla, for the purpose of making the user interface more convenient and 
efficient for a user to input information and commands in situations in which using a physical 
mouse is impractical or cumbersome.

7. Any inquiry concerning this communication or earlier communications from the
examiner should be directed to Angela A. Armstrong whose telephone number is 571-272-7598.
The examiner can normally be reached on Monday-Thursday 11:30-8:00 PM.

If attempts to reach the examiner by telephone are unsuccessful, the examiner's 
supervisor, David Hudspeth, can be reached on 571-272-7843. The fax phone number for the
organization where this application or proceeding is assigned is 571-273-8300. 

Please note the change in art unit designation for the examiner from old art unit "2654" to 
new art unit "2626." 

Information regarding the status of an application may be obtained from the Patent 
Application Information Retrieval (PAIR) system. Status information for published applications 
may be obtained from either Private PAIR or Public PAIR. Status information for unpublished 
applications is available through Private PAIR only. For more information about the PAIR 
system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR 
system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would 
like assistance from a USPTO Customer Service Representative or access to the automated 
information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000. 




Angela A Armstrong 
Primary Examiner 
Art Unit 2626 



AAA 

June 26, 2006 



